E. M. Kleinberg, An Overtraining-Resistant Stochastic Modeling Method for Pattern Recognition

Authors

  • Robert Haralick
  • David Ittner
  • George Nagy
Abstract

[...] a similar reason, we did not find the method of projection pursuit [5] in exploratory data analysis useful in this context.

4 Conclusions

We described a method for inferring from the training data faithful but concise representations of the empirical class-conditional distributions. In doing this, we have abandoned many usual simplifying assumptions about the distributions: e.g. that they are simply-connected, unimodal, convex, or parametric (e.g. Gaussian). We have shown that a classifier can be constructed using a metric defined on these distribution maps. Our method requires unusually large and representative training sets, which we provided through pseudo-random generation of training samples using a realistic model of printing and imaging distortions.

We illustrated the method on a challenging recognition problem: 3755 character classes of machine-print Chinese, in four typefaces, over a range of text sizes. In a test on over three million images, the perfect-metric classifier achieved better than 99% top-choice accuracy. In addition, we showed that it is superior to a conventional parametric classifier using comparable resources. The reject behavior, permitted by the distinctive distributions of distances to correct or incorrect classes, is also very reliable.

We have also shown a way to construct similar feature transformations and classifiers for arbitrary domains. The features and the metric were derived with minimum heuristics. Moreover, the classifier's accuracy can be improved with additional training data. In this study we have concentrated on linear mappings as features. Future studies may focus on investigating other families of features, especially those that can be used under weaker conditions on the class-conditional distributions.

Acknowledgements

We would like to thank Eugene Kleinberg for the communications of his theory, and Pavlidis for their helpful comments. The font descriptions used in the experiments were supplied by William Sun along with his software package 'GB2PS', distributed on the Internet. We are thankful for his contribution.

Similar resources

A Simple Implementation of the Stochastic Discrimination for Pattern Recognition

The method of stochastic discrimination (SD) introduced by Kleinberg ([6,7]) is a new method in pattern recognition. It works by producing weak classifiers and then combining them via the Central Limit Theorem to form a strong classifier. SD is overtraining-resistant, has a high convergence rate, and can work quite well in practice. However, some strict assumptions involved in SD and the difficu...

A Concrete Statistical Realization of Kleinberg’s Stochastic Discrimination for Pattern Recognition. Part I. Two-class Classification

The method of stochastic discrimination (SD) introduced by Kleinberg is a new method in statistical pattern recognition. It works by producing many weak classifiers and then combining them to form a strong classifier. However, the strict mathematical assumptions in Kleinberg [The Annals of Statistics 24 (1996) 2319–2349] are rarely met in practice. This paper provides an applicable way to reali...

On the Algorithmic Implementation of Stochastic Discrimination

Stochastic discrimination is a general methodology for constructing classifiers appropriate for pattern recognition. It is based on combining arbitrary numbers of very weak components, which are usually generated by some pseudorandom process, and it has the property that the very complex and accurate classifiers produced in this way retain the ability, characteristic of their weak component pi...

Overtraining in neural networks that interpret clinical data.

Backpropagation neural networks are a computer-based pattern-recognition method that has been applied to the interpretation of clinical data. Unlike rule-based pattern recognition, backpropagation networks learn by being repetitively trained with examples of the patterns to be differentiated. We describe and analyze the phenomenon of overtraining in backpropagation networks. Overtraining refers...

Energy-Based Source Tracking and Motion Pattern Recognition Using Acoustic Sensor Networks

Acoustic sensor networks can be used for localization of an acoustic-energy emitting source. While maximum-likelihood (ML) methods are widely used for estimating the pattern of motion, more advanced machine learning schemes should be employed for improving the accuracy of localization. In this paper, we develop a learning Bayesian tracking algorithm that is capable of reconstructing the target ...

Publication date: 1998